Flexible construction of hierarchical scale-free networks with general exponent 
Extensive studies have been done to understand the principles behind the
architectures of real networks. Recently, evidence for hierarchical organization
in many real networks has also been reported. Here, we present a new hierarchical
model which reproduces the main experimental properties observed in real networks:
a scale-free degree distribution $P(k)$ (the frequency of nodes that are connected
to $k$ other nodes decays as a power law, $P(k) \sim k^{-\gamma}$) and power-law
scaling of the clustering coefficient, $C(k) \sim k^{-1}$. The major novelties of
our model can be summarized as follows: (a) The model generates networks with a
scale-free degree distribution with general exponent $\gamma > 2$, arbitrarily
close to any specified value, and is therefore able to reproduce most of the
observed hierarchical scale-free topologies. In contrast, previous models cannot
obtain values of $\gamma > 2.58$. (b) Our model has structural flexibility because
(i) it can incorporate various types of basic building blocks (e.g., triangles,
tetrahedrons and, in general, fully connected clusters of $n$ nodes) and (ii) it
allows a large variety of configurations (i.e., the model can use more than
$n - 1$ copies of basic blocks of $n$ nodes). The structural features of our
proposed model might lead to a better understanding of the architectures of
biological and non-biological networks.

PACS numbers: 89.75.-k, 05.65.+b 



Recently, the importance of hierarchical modularity in the context of biological
networks and some non-biological networks has been pointed out, and a number of
theoretical models have been proposed. On the biological side, a major challenge
is to understand the relationships among fundamental elements such as genes,
proteins and chemical substrates in cells. It is believed that some groups of
interlinked elements (i.e., functional modules) can carry out relevant tasks at a
functional level. These functional modules can be integrated into larger groups,
generating a hierarchical organization [2]. Though experimental work is much more
important, the construction of adequate theoretical models is also important for
a better understanding of the general principles behind the architectures of
biological networks.

Theoretical models for explaining real complex networks have evolved in recent
years, from the classical random graph model and the small-world model to
scale-free network models. The most important feature of scale-free networks is
that the degree distribution $P(k)$ (the frequency of nodes that are connected to
$k$ other nodes) decays as a power law, $P(k) \sim k^{-\gamma}$. In the earliest
models of scale-free networks, probabilistic rules were employed to construct
networks incrementally. After that, deterministic scale-free models were a step
towards the simulation of a modular topology. However, these models lack the
power-law scaling of $C(k)$, because their nodes have clustering coefficient
$C_i(k_i) = 0$. Recently, modularity and hierarchical topology were introduced to
explain all the observed properties of complex networks.
These observed properties of real networks with $N$ nodes can be summarized as:
a scale-free degree distribution $P(k) \sim k^{-\gamma}$, power-law scaling of the
clustering coefficient $C(k) \sim k^{-1}$, and a high average clustering
coefficient $C(N)$ that is independent of the network size $N$. The clustering
coefficient of each node $i$ is defined as $C_i(k_i) = 2 n_i / [k_i (k_i - 1)]$,
where $n_i$ denotes the number of edges connecting the $k_i$ neighbors of node
$i$, and $C(N)$ reads $C(N) = [\sum_i C_i(k_i)] / N$. Finally, the function $C(k)$
is defined as the average clustering coefficient over nodes with the same degree
$k$: $C(k) = [\sum_{i: k_i = k} C_i(k_i)] / N(k)$, where $N(k)$ is the number of
nodes of degree $k$.
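
The three clustering quantities defined above can be computed directly from an adjacency structure. The following is a minimal sketch, not taken from the paper: the function names and the dictionary-of-sets graph representation are illustrative assumptions, and nodes with fewer than two neighbors are assigned $C_i = 0$ by convention (the definition is undefined there).

```python
from collections import defaultdict

def local_clustering(adj):
    """C_i = 2 n_i / (k_i (k_i - 1)) for every node of an undirected graph.

    adj maps each node to the set of its neighbors (hypothetical format).
    """
    c = {}
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            c[i] = 0.0  # convention assumed here, not stated in the text
            continue
        nb = list(nbrs)
        # n_i: number of edges among the k_i neighbors of node i
        n_i = sum(1 for a in range(k) for b in range(a + 1, k)
                  if nb[b] in adj[nb[a]])
        c[i] = 2.0 * n_i / (k * (k - 1))
    return c

def average_clustering(adj):
    """C(N): mean of C_i over all N nodes."""
    c = local_clustering(adj)
    return sum(c.values()) / len(c)

def clustering_by_degree(adj):
    """C(k): mean of C_i over the N(k) nodes that have degree k."""
    c = local_clustering(adj)
    groups = defaultdict(list)
    for i, nbrs in adj.items():
        groups[len(nbrs)].append(c[i])
    return {k: sum(v) / len(v) for k, v in groups.items()}

# Toy graph: a triangle (0, 1, 2) with one extra node 3 attached to node 0.
toy = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(local_clustering(toy))
print(average_clustering(toy))
print(clustering_by_degree(toy))
```

In the toy graph, node 0 has three neighbors with a single edge among them, so $C_0 = 1/3$, while nodes 1 and 2 each have $C_i = 1$.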

FIG. 1: (a) The RSMOB model. The initial cluster has four nodes, which are fully
connected. After the first replication the network consists of 16 nodes
($4^2 = 16$). (b) The re-organized structure of (a), to show clearly the
similarities and differences between the RSMOB model and our proposed model.
(c) Our proposed hierarchical model up to $i = 2$. We note that only one copy
(among four copies) exists with one edge connecting to the main hub. The number
of such copies is not restricted. When the number grows, $\gamma$ also increases.



Ravasz et al. (the RSMOB model in what follows) suggested a hierarchical model to
incorporate all the observed properties mentioned above in the same framework.
The model starts with a fully connected module of four nodes (the number of nodes
in the initial module can be different), and four identical copies are created,
obtaining a network of $N = 16$ nodes in the first replication ($4^2 = 16$ nodes).
This process can be repeated indefinitely. We illustrate the process in
Fig. 1(a). The model is reported to follow the power-law scaling
$C(k) \sim k^{-1}$ and to hold a scale-free topology $P(k) \sim k^{-\gamma}$, with
$\gamma = 1 + (\ln 4)/(\ln 3) \approx 2.26$. By modifying the number of nodes in
the initial module, the value of $\gamma$ changes. However, the value is
constrained to $2 < \gamma < 1 + (\ln 3)/(\ln 2) \approx 2.58$, which restricts
the range of possible applications.

In this article, we propose a new hierarchical model which integrates the
observed properties of real networks in a single framework. The model can
generate a scale-free topology with exponent $\gamma > 2$, arbitrarily close to
any specified value. In addition, our model has structural flexibility because it
can incorporate various types of basic building blocks (e.g., triangles,
tetrahedrons), which might lead to a better understanding of the architectures of
biological and non-biological networks.

FIG. 2: Topology and construction of our proposed model. (a) The model can start
with an arbitrary number of nodes which are fully connected. (b) Considering the
initial cluster of three nodes, the two leftmost triangles have all their nodes
connected to the main hub. This configuration is called the (2 + 2)
configuration. The degree of the main hub is calculated as $k = 2^i + 4$, where
$i$ is the number of iterations. (c) Four copies of (b) are made, and one node
(the new main hub at this iteration) is added. Fig. 2(c) contains four nodes as
the second intermediate hubs. Each of these hubs holds $k_j$ edges, where
$k_j = [2^j + 4] + 1$ and $j = 2$. (d) Following the same process, four copies of
(c) are created. The process can be iterated indefinitely, constructing a network
with power-law $P(k) \propto k^{-\gamma}$. (e) Sketch of our model considering
only the main hub with $k$ links and the nodes in the bottom level (i.e., non-hub
nodes) that are connected to the main hub. Since these non-hub nodes are
connected by $k'/2$ edges, where $k' = k - 4$, the clustering coefficient follows
$C(k) \sim k^{-1}$.

To explain our model with an example, we look at the structure depicted in
Fig. 2(b). We see that there is a set of four triangles (fully connected clusters
of three nodes) with their upper nodes connected to the main hub. In Fig. 2(a) we
notice that the initial cluster could have a different structure and could be a
fully linked cluster of 4, 5 or an even larger number of nodes. The initial
cluster corresponds to iteration $i = 1$. Fig. 2(b) shows iteration $i = 2$,
where four copies (the number of copies is selected according to the required
$\gamma$) of the initial cluster are created and one node in each copy is linked
to the main hub. In addition, we note that only two out of the four triangles
have all their vertices connected to the main hub. For brevity, we call the node
in a copy that was the main hub at the $j$-th iteration a $j$-th intermediate
hub, and call a node which is neither the main hub nor an intermediate hub a
non-hub node. In Fig. 2(c), we show the network at iteration $i = 3$. We make
four replicas of the network in Fig. 2(b) and connect the second intermediate
hubs in these copies to the main hub. In each of two copies, the four non-hub
nodes with the highest degree are also connected to the main hub, so that the
main hub has $4 + 8 = 2^3 + 4$ edges. In Fig. 2(d), we show the network at
iteration $i = 4$, which is obtained by making four replicas of Fig. 2(c),
following the same process explained above. This process can be iterated
indefinitely. The degree distribution of this network is dominated by the
intermediate hubs. There is a main hub at the top of the structure and new
intermediate hubs appear at each iteration. In Fig. 2(c) we see four nodes as the
second intermediate hubs.
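
The iterative construction described above can be sketched in code. This is an illustrative reading of the procedure, not the authors' implementation: the function name, the node labelling, and the bookkeeping list `candidates` are assumptions. In particular, the "non-hub nodes with the highest degree" are tracked explicitly as the nodes that were linked to the previous hub, and they are linked to the new hub in the first `l` of the `l + m` copies.

```python
from collections import Counter

def build_hierarchy(n_iter, l=2, m=2, block=3):
    """Sketch of an (l + m) construction; returns (adjacency dict, main hub)."""
    # Iteration 1: one fully connected block. Node 0 plays the role of the
    # "upper" node; the other nodes are candidates for extra links to a hub.
    adj = {v: set(range(block)) - {v} for v in range(block)}
    top = 0
    candidates = list(range(1, block))
    for _ in range(2, n_iter + 1):
        size = len(adj)
        hub = (l + m) * size              # label of the new main hub
        new_adj = {hub: set()}
        new_candidates = []
        for c in range(l + m):
            off = c * size                # relabel copy c by a constant offset
            for v, nbrs in adj.items():
                new_adj[v + off] = {u + off for u in nbrs}
            # Every copy: link its top node (the previous main hub) to the hub.
            new_adj[hub].add(top + off)
            new_adj[top + off].add(hub)
            # First l copies only: also link the candidate non-hub nodes.
            if c < l:
                for v in candidates:
                    new_adj[hub].add(v + off)
                    new_adj[v + off].add(hub)
                    new_candidates.append(v + off)
        adj, top, candidates = new_adj, hub, new_candidates
    return adj, top

# (2 + 2) configuration with triangles, four iterations.
adj, hub = build_hierarchy(4)
degree_counts = Counter(len(nbrs) for nbrs in adj.values())
print("nodes:", len(adj))
print("main hub degree:", len(adj[hub]))          # compare with 2**4 + 4
print("hubs with degree 2**2 + 5:", degree_counts[2**2 + 5])
print("hubs with degree 2**3 + 5:", degree_counts[2**3 + 5])
```

Under these assumptions the main hub degree after $i$ iterations is $2^i + 4$ for the (2 + 2) configuration with triangles, matching the value quoted in the caption of Fig. 2.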

Suppose that we have a network after $n$ iterations. It is straightforward to see
that the degree of the main hub is $k = 2^n + 4$. Since one edge is appended to
the $j$-th intermediate hub at the $(j + 1)$-th iteration, the degree $k_j$ of the
$j$-th intermediate hub will be $k_j = (2^j + 4) + 1$, if $2 \le j < n$. We can
see that the total number $N_j$ of $j$-th intermediate hubs satisfies
$N_j = 4^{(n-j)}$. From $k_j = (2^j + 4) + 1$, we can write
$\ln k_j \approx j \ln 2$, and from $N_j = 4^{(n-j)}$ we have
$\ln N_j = (n - j) \ln 4 = c_1 - j \ln 4$. From these expressions it is
straightforward to write
$\ln N_j = c_1 + \ln k_j^{-(\ln 4 / \ln 2)} = c_1 + \ln k_j^{-2}$. Hence, the
number of hubs with degree $k$ (i.e., the distribution of hubs with degree $k$)
in the proposed network follows the power law $N_j \propto k_j^{-2}$. However, we
must notice that in a hierarchical network the distinct values of the degree $k$
are scarce; therefore the probability distribution of the node degree is properly
defined as $P(k) = (1/N_{tot})(N(k)/\Delta k)$, where $N(k)$ is the number of
nodes with degree $k$, $N_{tot}$ is the total number of nodes, and $\Delta k$
means that nodes are binned into intervals according to degree $k$. In addition,
we note that for the hierarchical model $\Delta k$ changes linearly with $k$
(i.e., $\Delta k_{j+1} = k_{j+1} - k_j = 2^j \approx k_j$). Hence, this linear
dependence of $\Delta k$ makes the probability distribution in the proposed
network follow the power law $P(k) \propto k^{-3}$. In general, the binning gives
rise to $\gamma = 1 + \gamma'$, where $\gamma'$ is the exponent of the power-law
distribution of hubs.
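
The binning step can be written out explicitly. The following short derivation only restates the argument above with the quantities already defined ($N_j$, $k_j$, $\Delta k$, $\gamma'$); taking the bin width at $k_j$ to be proportional to $k_j$ is the approximation stated in the text.

```latex
\begin{align}
  N(k_j) &\propto k_j^{-\gamma'} ,
  \qquad
  \Delta k_{j+1} = k_{j+1} - k_j = 2^{j} \approx k_j
  \;\Rightarrow\; \Delta k \propto k_j , \\
  P(k_j) &= \frac{1}{N_{tot}} \, \frac{N(k_j)}{\Delta k}
          \propto \frac{k_j^{-\gamma'}}{k_j}
          = k_j^{-(\gamma' + 1)} , \\
  \gamma &= 1 + \gamma' ,
  \qquad \text{e.g. } \gamma' = \frac{\ln 4}{\ln 2} = 2
  \;\Rightarrow\; \gamma = 3 .
\end{align}
```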

The construction can be generalized in the following way. We denote by
$(l + m)$-configuration one such that, at each (say the $i$-th) iteration,
$l + m$ copies of the network at the $(i - 1)$-th iteration are created. With
this configuration, we construct two types of connections between the copies and
the main hub at the $i$-th iteration: connections between the $(i - 1)$-th
intermediate hubs and the main hub, and connections between the non-hub nodes
with the highest degree and the main hub.

We notice here that this configuration is flexible and can be modified. There are
two important and modifiable factors: (i) the number of copies ($l + m$) and the
number of copies ($l$) for which some of the non-hub nodes are connected to the
main hub, and (ii) the basic building blocks (e.g., triangle, tetrahedron). The
former determines the value of $\gamma$ and the latter affects the structure of
the network architecture.

Here we describe the configuration of networks in our model in more detail. First
we consider a configuration which is able to reproduce the observed value of
$\gamma = 3.25$ in the language network, which has a hierarchical organization.
This network is generated by connecting two words to each other if they appear as
synonyms in the Merriam-Webster dictionary. We construct the network with the
(2 + 3) configuration ($k_j = (2^j + 5) + 1$ and $N_j = 5^{(n-j)}$), and we
obtain $N_j \propto k_j^{-(\ln 5 / \ln 2)}$, where after binning we get
$\gamma \approx 3.3$. This value is in good agreement with the observed
$\gamma = 3.25$, which is not accessible with the RSMOB model, because the RSMOB
model can only handle the case $m = 1$. Next we consider the general case. With
the $(l + m)$ configuration, we obtain $k_j = [l^j + (l + m)] + 1$,
$N_j = (l + m)^{(n-j)}$ and $N_j \propto k_j^{-\ln(l+m)/\ln l}$, which indicates
that by tuning the parameters $l$ and $m$ we obtain a network with exponent
$\gamma$ arbitrarily close to any required value above two.
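
The dependence of the exponent on the configuration can be tabulated directly from the general formula above, $\gamma = 1 + \ln(l + m)/\ln l$. The snippet below is a small sketch; the function names are illustrative and not from the paper. It evaluates the configurations discussed in the text.

```python
import math

def hub_exponent(l, m):
    """gamma' = ln(l + m) / ln(l): exponent of the hub distribution N_j ~ k_j^(-gamma')."""
    if l < 2 or m < 0:
        raise ValueError("this sketch assumes l >= 2 and m >= 0")
    return math.log(l + m) / math.log(l)

def degree_exponent(l, m):
    """gamma = 1 + gamma': exponent of P(k) after binning."""
    return 1.0 + hub_exponent(l, m)

# Configurations mentioned in the text, written as (l, m).
for l, m in [(2, 2), (2, 3), (3, 1), (3, 3)]:
    print(f"({l} + {m}) configuration: gamma = {degree_exponent(l, m):.2f}")
```

For a fixed $l$, increasing $m$ raises $\gamma$ without bound, while increasing $l$ at fixed $m$ pushes $\gamma$ toward 2, which is how the two parameters together cover values above two.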

This construction of the hierarchical network has several advantages compared
with the RSMOB model. First, $\gamma$ can be arbitrarily close to any specified
value above two, free from the restrictions of the RSMOB model. Secondly, our
procedure to generate the structure is more flexible and allows a greater variety
of configurations. In Fig. 1(a) we show two iterations of the RSMOB model with 4
initial nodes, and in Fig. 1(c) we show our model up to $i = 2$. Fig. 1(b) shows
a re-organization of Fig. 1(a) to point out the similarities and main differences
between the RSMOB model and our proposed model. In the setup of Fig. 1, our model
provides a dependence for the hubs as $N_j \propto k_j^{-(\ln 4 / \ln 3)}$, and
after binning we obtain $\gamma = 1 + (\ln 4)/(\ln 3) \approx 2.26$, which is the
same result provided by the RSMOB model. In addition, our topology becomes more
flexible by increasing the number of copies. For example, with the (3 + 3)
configuration, we obtain $N_j \propto k_j^{-(\ln 6 / \ln 3)}$, and after binning
we get the value $\gamma = 2 + (\ln 2)/(\ln 3) \approx 2.63$, which is not
accessible with the RSMOB model.

Evidence for hierarchical organization in many real networks (biological and
non-biological) has recently been reported. On the biological side, the metabolic
network was analysed and the results showed that the value of the exponent is
$\gamma = 2.2$ and that the clustering coefficient $C(k)$ scales as $k^{-1}$.
Protein domain networks were analyzed using data from different domain databases,
and scale-free behaviors were reported with exponents $\gamma = 2.5$ (ProDom
database), $\gamma = 1.7$ (Pfam), and $\gamma = 1.7$ (Prosite). A protein
interaction network of S. cerevisiae was studied in [19] and it was found that
$\gamma = 2.5$. The hierarchical signature of this network was also revealed
through a power-law scaling of $C(k)$ with $k$. Among non-biological networks, we
can also find examples which hold a scale-free topology integrated in a
hierarchical organization. Here, we only mention the type of network and the
corresponding value of $\gamma$: $\gamma = 2.3$ for the actor network,
$\gamma_{out} = 2.45$ and $\gamma_{in} = 2.1$ (denoting the out- and in-degree
distributions, respectively) for the World Wide Web, $\gamma = 2.1$ to $2.2$ [21]
for the Internet at the AS level (interdomain level), and $\gamma = 3.25$ for the
language network [4]. In all these cases the scaling of $C(k)$ suggests a
hierarchical organization. For these examples with $\gamma > 2$, our model is
able to generate the scale-free topology with exponents arbitrarily close to the
values shown above.

FIG. 3: Circles: Distribution of nodes with degree $k$, $N(k)$, normalized to the
total number of nodes, $N_{tot}$ (i.e., $N(k)/N_{tot}$). The network is
constructed with the configuration (2 + 2), up to $i = 8$ and with three nodes as
the initial cluster (triangles as building blocks). Dashed line: Fit to the
circles (only the main and intermediate hubs). It shows a power law with exponent
$\gamma' = 2.28$. Triangles: Probability distribution
$P(k) = (1/N_{tot})(N(k)/\Delta k)$, where $\Delta k$ means that hubs with degree
$k$ are binned into intervals $\Delta k_{j+1} = k_{j+1} - k_j = 2^j \approx k_j$
(i.e., $k_j < k < k_{j+1}$). We note that for degrees $k = 8$ and $k = 9$ we used
a fixed power-of-two bin width $\Delta k$. For $1 < k < 7$, there are values for
each $k$, and the binning is not required. Squares: Subtracting the value 5 on
the $k$ axis from the triangles (only the main and intermediate hubs). Continuous
line: Fit to the squares. It shows a power law with exponent $\gamma \approx 3$.

In Fig. 3 we show the degree distribution of our model with the (2 + 2)
configuration, up to $i = 8$. As we explained before, the tail of that
distribution (hubs) should follow a power law. The dashed line is a fit to the
degrees of the hubs of our generated network; it represents just the distribution
of nodes normalized to the total number of nodes. We see that the value of
$\gamma'$ is slightly different from the theoretical value of 2, but the
difference comes from the approximation made in going from $k_j = (2^j + 4) + 1$
to $\ln k_j \approx j \ln 2$. If we plot the points after subtracting 5 units on
the $k$ axis and fit them, we find exactly $\gamma' = 2$, indicating that the
difference between the two results comes from that approximation. However, we are
interested in the probability distribution of the node degree,
$P(k) = (1/N_{tot})(N(k)/\Delta k)$. In Fig. 3 we show the probability
distribution (triangles) after binning is applied to the hubs. In addition, we
plot the probability distribution of the hubs after subtracting 5 units on the
$k$ axis (squares). The continuous line is fitted to the squares and shows a
power-law probability distribution with exponent $\gamma = 3$.

FIG. 4: The clustering coefficient $C(k)$ evaluated with the configurations
[2 + 2, up to 6 iterations] (circles) and [2 + 3, up to 5 iterations] (squares).
In both cases, the building blocks are triangles.

It is worth noticing that we can also reproduce the distribution without explicit
construction of the network. If we compute the values of $2^j + 5$ (degree of
hubs) versus the values of $4^{(n-j)}$ (the number of copies) for
$j = 1, \ldots, n$ and $n = 20$, we can obtain the power law corresponding to
$\gamma' = 2$ for the distribution of nodes and $\gamma = 3$ for the probability
distribution after binning. This indicates that by generating a larger number of
iterations in our model we are able to obtain exactly the predicted exponents.
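
This check is easy to reproduce numerically. The sketch below is an assumption-laden illustration rather than the authors' procedure: it uses an ordinary least-squares fit in log-log space (the fitting method is not specified in the text), takes the bin width at the $j$-th hub as $2^j$, and subtracts 5 from the degree axis as described for Fig. 3.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of ln(y) against ln(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    return sxy / sxx

n = 20
js = range(1, n + 1)
k = [2**j + 5 for j in js]          # hub degrees
N = [4 ** (n - j) for j in js]      # number of hubs with that degree
dk = [2**j for j in js]             # bin width, assumed equal to 2^j

k_shifted = [kj - 5 for kj in k]    # subtract 5 units on the k axis
slope_hubs = loglog_slope(k_shifted, N)                                  # -gamma'
slope_binned = loglog_slope(k_shifted, [a / b for a, b in zip(N, dk)])   # -gamma
slope_raw = loglog_slope(k, N)      # without the shift: steeper than -2

print("hub distribution slope:", slope_hubs)
print("binned distribution slope:", slope_binned)
print("unshifted hub slope:", slope_raw)
```

With the shift applied, $\ln N_j$ is exactly linear in $\ln 2^j$, so the fitted slopes equal $-2$ and $-3$ up to floating-point error; without the shift the fitted slope is somewhat steeper than $-2$, which is the kind of deviation discussed for the dashed line of Fig. 3.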

In Fig. 4, we calculate $C(k)$ for the (2 + 2) and (2 + 3) configurations of our
model and we see the power-law scaling $C(k) \sim k^{-1}$, which is also a key
feature of hierarchical networks. In Fig. 2(e) we show a sketch of our model
considering only the main hub with $k'$ ($k' = k - l - m$) edges to non-hub
nodes. It is seen that there are $k'/2$ edges among these non-hub nodes. From
this, it is straightforward to see that the clustering coefficient of a hub of
degree $k$, counting only the edges among its non-hub neighbors, is
$C(k) = (k'/2)/[k(k - 1)/2] \sim k^{-1}$, showing the power-law scaling of the
clustering in our model. Concerning the average clustering coefficient $C(N)$,
its behavior in our model is independent of the network size $N$ as a consequence
of the power-law scaling of $C(k)$, in agreement with the observed properties of
metabolic networks.
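
As a worked illustration of this scaling, the expression for $C(k)$ can be simplified and evaluated for one concrete hub. The numbers below assume the (2 + 2) configuration after $n = 8$ iterations, so that $k = 2^8 + 4 = 260$ and $k' = 256$; they follow from the formulas in the text and are not values reported by the authors.

```latex
\begin{align}
  C(k) &= \frac{k'/2}{k(k-1)/2}
        = \frac{k - l - m}{k(k-1)}
        \;\xrightarrow{\,k \gg l + m\,}\; \frac{1}{k} , \\
  C(260) &= \frac{256}{260 \cdot 259}
          = \frac{256}{67340}
          \approx 3.80 \times 10^{-3} ,
  \qquad
  \frac{1}{260} \approx 3.85 \times 10^{-3} .
\end{align}
```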

It is interesting to note that our model holds a similarity with the earliest
probabilistic scale-free model, in particular with its preferential attachment
feature. In that model, new nodes are added at each time step $t$, and the
probability that a new node is connected to an already present node $i$ depends
on the degree $k_i$ of that node ($k_i / \sum_j k_j$). As we can see in Fig. 2,
in each iteration we add a new node (the main hub) plus copies of the previous
structures. The new hub is connected deterministically to non-hub nodes, but only
to those which have the highest degree. In that sense, a remnant of the
preferential attachment concept is held in our model, though the degree
distribution of the non-hub nodes does not follow a power law as in the RSMOB
model.

In conclusion, we have presented a new model to reproduce the main features of
hierarchical organization, which is one of the central challenges in the field of
network science. Our model holds important properties such as structural
flexibility and the more general capability to generate values of $\gamma > 2$,
and is therefore able to reproduce most of the observed scale-free topologies,
even in networks with exponents above $\gamma = 2.58$, where the RSMOB model
fails. Therefore, our model might be a useful tool to uncover the hierarchical
features of biological and non-biological networks in a broader scope.